Smart Paper Notes

"Stop Asian Hate!" : Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic

Huy Nghiem , Fred Morstatter

Category: Natural Language Processing

2021-12-04

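*Content warning: this work displays examples of explicit and strongly offensive language. The COVID-19 pandemic has fueled a surge in anti-Asian xenophobia and prejudice. Many people have taken to social media to express these negative sentiments, which calls for reliable systems that detect hate speech against this often under-represented demographic. In this paper, we create and annotate a corpus of Twitter tweets using two experimental approaches to explore anti-Asian abusive and hate speech at a finer granularity. Using the dataset with the less biased annotation, we deploy multiple models and also examine the applicability of other relevant corpora to these multi-task classifications. In addition to demonstrating promising results, our experiments offer insights into the nuances of cultural and logistical factors in annotating hate speech for different demographics. Together, our analyses aim to advance the understanding of hate speech detection, particularly for low-resource groups.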

translated by Google Translate

A Machine Learning Case Study for AI-empowered echocardiography of Intensive Care Unit Patients in low- and middle-income countries

Xochicale Miguel , Thwaites Louise , Yacoub Sophie , Pisani Luigi , Tran Huy Nhat Phung , Kerdegari Hamideh , King Andrew , Gomez Alberto

Category: Machine Learning

2022-12-30

We present a machine learning (ML) case study that illustrates the challenges of clinical translation for a real-time AI-empowered echocardiography system, using data from ICU patients in low- and middle-income countries (LMICs). The case study covers data preparation, curation and labelling of 2D ultrasound videos from 31 ICU patients in LMICs, as well as model selection, validation and deployment of three thinner neural networks to classify the apical four-chamber view (4CV). The results of the ML heuristics show promising implementation, validation and application of thinner networks for classifying 4CV with limited datasets. We conclude by noting the need for (a) datasets with greater diversity of demographics and diseases, and (b) further investigation of thinner models that can run on low-cost hardware, so that they can be clinically translated to ICUs in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.

Face Forgery Detection Based on Facial Region Displacement Trajectory Series

YuYang Sun , ZhiYong Zhang , Isao Echizen , Huy H. Nguyen , ChangZhen Qiu , Lu Sun

Categories: Computer Vision | Artificial Intelligence

2022-12-07

Deep-learning-based technologies such as deepfakes have been attracting widespread attention in both society and academia, particularly those used to synthesize forged face images. These automatic face manipulation technologies, which require no professional skill, can be used to replace the face in an original image or video with any target object while maintaining the expression and demeanor. Since human faces are closely related to identity characteristics, maliciously disseminated identity-manipulated videos could trigger a crisis of public trust in the media and could even have serious political, social, and legal implications. To effectively detect manipulated videos, we focus on the position offset in the face blending process, which results from the forced affine transformation of the normalized forged face. We introduce a method for detecting manipulated videos that is based on the trajectory of the facial region displacement. Specifically, we develop a virtual-anchor-based method for extracting the facial trajectory, which can robustly represent displacement information. This information is used to construct a network, based on dual-stream spatial-temporal graph attention and a gated recurrent unit backbone, that exposes multidimensional artifacts in the trajectory sequences of manipulated videos. Testing of our method on various manipulation datasets demonstrated that its accuracy and generalization ability are competitive with those of the leading detection methods.

CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness

Huy Phan , Miao Yin , Yang Sui , Bo Yuan , Saman Zonouz

Category: Computer Vision

2022-12-04

Model compression and model defense for deep neural networks (DNNs) have been studied extensively, but largely in isolation. Because model compactness and robustness are both important in practical applications, several prior works have explored improving the adversarial robustness of sparse neural networks. However, the structured sparse models obtained by existing works suffer severe degradation in both benign and robust accuracy, creating a challenging dilemma between the robustness and the structuredness of compact DNNs. To address this problem, we propose CSTAR, an efficient solution that simultaneously imposes low-rankness-based Compactness, high STructuredness and high Adversarial Robustness on the target DNN models. By formulating the low-rankness and robustness requirements within the same framework and determining the ranks globally, the compressed DNNs can simultaneously achieve high compression performance and strong adversarial robustness. Evaluations of various DNN models on different datasets demonstrate the effectiveness of CSTAR. Compared with state-of-the-art robust structured pruning methods, CSTAR shows consistently better performance. For instance, when compressing ResNet-18 on CIFAR-10, CSTAR achieves up to 20.07% and 11.91% improvement in benign accuracy and robust accuracy, respectively. When compressing ResNet-18 with a 16x compression ratio on ImageNet, CSTAR obtains an 8.58% benign accuracy gain and a 4.27% robust accuracy gain over the existing robust structured pruning method.

Hierarchical Sliced Wasserstein Distance

Khai Nguyen , Tongzheng Ren , Huy Nguyen , Litu Rout , Tan Nguyen , Nhat Ho

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-27

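The sliced Wasserstein (SW) distance has been widely used in different application scenarios because it scales to a large number of supports without suffering from the curse of dimensionality. The value of the sliced Wasserstein distance is the average transportation cost between one-dimensional representations (projections) of the original measures, obtained via the Radon transform (RT). Despite its efficiency in the number of supports, estimating the sliced Wasserstein distance requires a relatively large number of projections in high-dimensional settings. Therefore, in applications where the number of supports is small relative to the dimension, for example several deep learning applications that use mini-batch approaches, the matrix multiplication in the Radon transform becomes the main computational bottleneck. To address this issue, we propose to derive projections by linearly and randomly combining a smaller number of projections, which we call bottleneck projections. We explain the use of these projections by introducing the hierarchical Radon transform (HRT), which is constructed by applying Radon transform variants recursively. We then formulate the approach as a new metric between measures, named the hierarchical sliced Wasserstein (HSW) distance. By proving the injectivity of HRT, we derive the metricity of HSW. Moreover, we investigate the theoretical properties of HSW, including its connection to SW variants and its computational and sample complexities. Finally, we compare the computational cost and generative quality of HSW with those of the conventional SW on deep generative modeling tasks, using various benchmark datasets including CIFAR10, CelebA, and Tiny ImageNet.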
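To make the contrast between the two estimators concrete, here is a minimal pure-Python sketch, not the authors' implementation. It assumes equal-size empirical measures with uniform weights, and it assumes that both the bottleneck directions and the mixing weights are drawn uniformly from unit spheres. The function names and the parameters `n_bottleneck` and `n_proj` are illustrative.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(points, direction):
    """One-dimensional projection of each point onto `direction`."""
    return [sum(x * t for x, t in zip(pt, direction)) for pt in points]


def wasserstein_1d(a, b, p=2):
    """W_p^p between two equal-size 1-D empirical measures (sort and match)."""
    return sum(abs(x - y) ** p for x, y in zip(sorted(a), sorted(b))) / len(a)


def sliced_wasserstein(xs, ys, n_proj=64, p=2, seed=0):
    """Monte Carlo SW: every projection costs a full d-dimensional dot product."""
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        theta = random_unit_vector(dim, rng)
        total += wasserstein_1d(project(xs, theta), project(ys, theta), p)
    return (total / n_proj) ** (1.0 / p)


def hierarchical_sliced_wasserstein(xs, ys, n_bottleneck=8, n_proj=64, p=2, seed=0):
    """Sketch of HSW: few d-dimensional projections, then cheap random mixtures."""
    rng = random.Random(seed)
    dim = len(xs[0])
    # Step 1: a small number k of bottleneck projections (the only O(d) work).
    thetas = [random_unit_vector(dim, rng) for _ in range(n_bottleneck)]
    bx = [project(xs, t) for t in thetas]
    by = [project(ys, t) for t in thetas]
    # Step 2: each final projection is a random linear combination of the
    # k bottleneck projections (assumed here: weights on the unit sphere in R^k).
    total = 0.0
    for _ in range(n_proj):
        w = random_unit_vector(n_bottleneck, rng)
        px = [sum(w[j] * bx[j][i] for j in range(n_bottleneck)) for i in range(len(xs))]
        py = [sum(w[j] * by[j][i] for j in range(n_bottleneck)) for i in range(len(ys))]
        total += wasserstein_1d(px, py, p)
    return (total / n_proj) ** (1.0 / p)
```

With `n` supports in dimension `d`, the plain estimator spends on the order of `n_proj * n * d` multiplications on projections, while the sketch above spends `n_bottleneck * n * d + n_proj * n * n_bottleneck`, which is smaller when `n_bottleneck` is much smaller than both `d` and `n_proj`.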

An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit with Low Regret

Matthew Jones , Huy Lê Nguyen , Thy Nguyen

Category: Machine Learning

2022-09-23

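Recently, a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately, previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds $T$. We propose a new efficient algorithm whose regret is lower than even that of the previous inefficient algorithms. For $N$ agents, $K$ arms, and $T$ rounds, our approach has a regret bound of $\tilde{O}(\sqrt{NKT} + NK)$. This improves on the previous approach, which has a regret bound of $\tilde{O}(\min(NK, \sqrt{N}K^{3/2})\sqrt{T})$. We also complement the efficient algorithm with an approach that achieves $\tilde{O}(\sqrt{KT} + N^2K)$ regret. Experimental findings confirm the effectiveness of our efficient algorithm compared with previous approaches.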
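The difference between the two objectives can be seen on a toy instance. The sketch below uses the standard definition of Nash social welfare as the product of the agents' expected rewards under a distribution over arms. The reward matrix and the function names are illustrative assumptions, not taken from the paper.

```python
from math import prod


def expected_rewards(policy, means):
    """Expected reward of each agent when an arm is drawn from `policy`.

    means[i][k] is the mean reward of arm k for agent i.
    """
    return [sum(p * m for p, m in zip(policy, row)) for row in means]


def nash_social_welfare(policy, means):
    """Product of the agents' expected rewards (the fairness-aware objective)."""
    return prod(expected_rewards(policy, means))


def utilitarian_welfare(policy, means):
    """Sum of the agents' expected rewards (the total-utility objective)."""
    return sum(expected_rewards(policy, means))


# Hypothetical instance: agent 0 only benefits from arm 0, agent 1 only from arm 1.
means = [[1.0, 0.0],
         [0.0, 1.0]]
always_arm_0 = [1.0, 0.0]
uniform_mix = [0.5, 0.5]
```

On this instance both policies have the same total utility, but always pulling arm 0 gives agent 1 nothing and therefore has zero Nash social welfare, whereas the uniform mixture does not. Regret in this setting is measured against the distribution that maximizes the product.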

Personalized Longitudinal Assessment of Multiple Sclerosis Using Smartphones

Oliver Y. Chén , Florian Lipsmeier , Huy Phan , Frank Dondelinger , Andrew Creagh , Christian Gossens , Michael Lindemann , Maarten de Vos

Category: Machine Learning (Statistics)

2022-09-20

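Personalized longitudinal disease assessment is central to quickly diagnosing, appropriately managing, and optimally adapting the therapeutic strategy for multiple sclerosis (MS). It is also important for identifying idiosyncratic, subject-specific disease profiles. Here, we design a novel longitudinal model that maps individual disease trajectories in an automated way using sensor data that may contain missing values. First, we collect digital measurements related to gait and balance and to upper extremity function, using sensor-based assessments administered on a smartphone. Next, we treat missing data via imputation. We then discover potential markers of MS by employing generalized estimating equations. Subsequently, parameters learned from multiple training datasets are combined to form a simple, unified longitudinal predictive model that forecasts MS over time in previously unseen people with MS. To mitigate potential underestimation for individuals with severe disease scores, the final model incorporates data from the first day. The results show that the proposed model is promising for personalized longitudinal MS assessment. They also suggest that features related to gait and balance and to upper extremity function, collected remotely through sensor-based assessments, may be useful digital markers for predicting MS over time.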

Predicting Performances of Mutual Funds using Deep Learning and Ensemble Techniques

Nghia Chu , Binh Dao , Nga Pham , Huy Nguyen , Hien Tran

Category: Machine Learning

2022-09-18

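Predicting fund performance is beneficial to both investors and fund managers, yet it is a challenging task. In this paper, we test whether deep learning models can predict fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents risk-adjusted performance and ensures meaningful comparability across funds. We calculate annualized Sharpe ratios from monthly return time series for more than 600 open-end mutual funds investing in listed large-cap equities in the United States. We find that long short-term memory (LSTM) and gated recurrent unit (GRU) deep learning methods, both trained with modern Bayesian optimization, forecast funds' Sharpe ratios more accurately than traditional statistical methods. An ensemble method that combines the forecasts from LSTM and GRU achieves the best performance of all models. The evidence suggests that deep learning and ensembling offer promising solutions to the challenge of fund performance forecasting.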
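As a point of reference for the target variable, the sketch below computes an annualized Sharpe ratio from monthly returns. The abstract does not state the exact formula used, so this assumes the common convention: mean monthly excess return divided by its sample standard deviation, scaled by the square root of 12. The function name and the `monthly_risk_free` parameter are illustrative.

```python
import math
import statistics


def annualized_sharpe(monthly_returns, monthly_risk_free=0.0):
    """Annualized Sharpe ratio from a series of monthly returns.

    Assumed convention: mean(excess) / sample_stdev(excess) * sqrt(12).
    Needs at least two observations and a non-constant return series.
    """
    excess = [r - monthly_risk_free for r in monthly_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(12)


# Hypothetical monthly returns for one fund.
fund_returns = [0.01, 0.03, -0.01, 0.02]
fund_sharpe = annualized_sharpe(fund_returns)
```

With a zero risk-free rate the ratio is unchanged when all returns are scaled by the same positive factor, which is what makes it comparable across funds with different volatility levels.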

EMaP: Explainable AI with Manifold-based Perturbations

Minh N. Vu , Huy Q. Mai , My T. Thai

Category: Machine Learning

2022-09-18

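In the last few years, many explanation methods based on perturbations of the input data have been introduced to improve our understanding of decisions made by black-box models. The goal of this work is to introduce a novel perturbation scheme that yields more faithful and robust explanations. Our study focuses on the impact of the perturbation direction on the data topology. We show that perturbing along directions orthogonal to the input manifold better preserves the data topology, both in a worst-case analysis of the discrete Gromov-Hausdorff distance and in an average-case analysis via persistent homology. Building on these results, we introduce the EMaP algorithm, which realizes the orthogonal perturbation scheme. Our experiments show that EMaP not only improves the explainers' performance but also helps them overcome a recent attack against perturbation-based methods.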

Robust Product Classification with Instance-Dependent Noise

Huy Nguyen , Devashish Khatwani

Category: Natural Language Processing

2022-09-14

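Noisy labels in large e-commerce product data (i.e., product items placed into incorrect categories) are a critical issue for product categorization because they are unavoidable, non-trivial to remove, and significantly degrade prediction performance. Training a product title classification model that is robust to noisy labels in the data is very important for making product classification applications more practical. In this paper, we study the impact of instance-dependent noise on the performance of product title classification by comparing our data denoising algorithm with different noise-resistant training algorithms, which are designed to prevent a classifier from overfitting to noise. We develop a simple yet effective deep neural network for product title classification to use as the base classifier. Along with recent methods for simulating instance-dependent noise, we propose a novel noise simulation algorithm based on product title similarity. Our experiments cover multiple datasets, various noise methods, and different training solutions. The results reveal the limits of the classification task when the noise rate is not negligible and the data distribution is highly skewed.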
